Artificial Neural Networks Analysis Used to Evaluate the Molecular Interactions between Selected Drugs and Human Cyclooxygenase2 Receptor.

OBJECTIVE(S)
A fast and reliable evaluation of the binding energy from a single conformation of a molecular complex is an important practical task. Artificial neural networks (ANNs) are powerful tools for approximating nonlinear functions and are used in this paper to predict binding energy. We propose a network structure that estimates binding energy from physicochemical molecular descriptors of the selected drugs.


MATERIAL AND METHODS
A set of 33 drugs from different structural groups, with known binding energies to the cyclooxygenase-2 enzyme (COX2), was considered. Twenty-seven physicochemical property descriptors were calculated by standard molecular modeling. Binding energy was calculated for each compound by docking and was also predicted by the ANN. A multi-layer perceptron neural network was used.


RESULTS
The proposed ANN model based on selected molecular descriptors showed a high degree of correlation between observed and calculated binding energies. The final model possessed a 27-4-1 architecture, and the correlation coefficients for the learning, validation and testing sets were 0.973, 0.956 and 0.950, respectively.


CONCLUSION
The results show that docking results and ANN data have a high correlation. ANN proved to be a strong tool for predicting the binding energy, and thus the inhibition constants, of different drugs in very short periods of time.


Introduction
Docking is a method which predicts the preferred orientation of one molecule relative to a second when they are bound to each other to form a stable complex (1). Docking is frequently used to predict the binding orientation of small-molecule drug candidates to their protein targets in order to predict, in turn, the affinity and activity of the small molecule. Hence docking plays an important role in the rational design of drugs (2). Given the biological and pharmaceutical significance of molecular docking, considerable effort has been directed towards improving the methods used to predict docking.
Two approaches are generally used for docking calculations. One approach uses a matching technique that describes the protein and the ligand as complementary surfaces (3). The second approach simulates the actual docking process in which the ligand-protein pairwise interaction energies are calculated (4).
In geometric matching, the protein and ligand are described as sets of features that enable them to be docked. In one method, the receptor's surface is described in terms of solvent-accessible surface area and the ligand's molecular surface in terms of a matching surface description. Another method describes hydrophobic features of the protein using turns in main-chain atoms. Yet another approach uses a Fourier shape descriptor technique (5,6).
The simulation of docking is a much more complicated process. In this method, the ligand and receptor are positioned at a distance from each other and the ligand is allowed to find its way into the active site in a certain number of moves. The moves incorporate rigid-body transformations such as translations and rotations. After each move, the total energy of the system is calculated.
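A single rigid-body move of the kind described above can be sketched as follows. This is only an illustration: the function names, the choice of rotation about the z-axis, and the three-point "ligand" are assumptions made for the example, and the energy evaluation that would follow each move is deliberately left out.

```python
import math

def rigid_body_move(coords, angle_rad, shift):
    """Rotate points about the z-axis through the origin, then translate them."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    dx, dy, dz = shift
    moved = []
    for x, y, z in coords:
        moved.append((c * x - s * y + dx, s * x + c * y + dy, z + dz))
    return moved

def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical three-atom "ligand" (coordinates in arbitrary units).
ligand = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# One move: a 90 degree rotation followed by a translation.
moved = rigid_body_move(ligand, math.pi / 2, (2.0, 0.0, -1.0))
```

Because the move is rigid, all distances between ligand atoms are unchanged; only the position and orientation relative to the receptor differ.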
Artificial neural network (ANN) analysis is a method of data analysis that imitates the way the human brain works. The power of ANNs has been shown over the years by their successful use in many types of problems with different degrees of complexity and in different fields of application. Neural networks represent the way in which arrays of neurons probably function in biological learning and memory (7). These networks are known as universal approximators and as computational models with particular characteristics such as the ability to learn or adapt, and to organize or generalize data. The learning of ANNs takes place by training with examples, "in a process that uses a training algorithm to iteratively adjust the connection weights between neurons to produce the desired input-output relationships" (8). ANNs have been widely used in optimization, calibration, modeling and pattern recognition. They are very useful in the medical and pharmaceutical sciences, for example in the diagnosis of diseases (9)(10)(11). ANNs have also shown good potential in the calculation of physicochemical and biological properties of drugs, with particular attention to the pharmaceutical and chemical areas (12). In recent years many studies have been done in this field. Agatonovic-Kustrin and Beresford (13) reviewed the pharmaceutical applications of the ANN method. ANN has been used to calculate the aqueous solubility of drugs employing a number of molecular descriptors (14), and in other situations (15)(16)(17)(18).
It is proposed that, by using artificial neural networks, a set of descriptors can be used to predict the binding energy of the final docked complex and so facilitate and speed up screening processes.
The aim of this study was to design and test an appropriate ANN that could predict binding energy on the basis of structural descriptors of the selected drugs.

Structural parameters from molecular modeling
Descriptors of the structure of the drugs were calculated by standard molecular modeling. Hyperchem® Ver. 8.5 for the Windows® operating system was used. Geometry optimization was performed using the molecular mechanics MM+ force field and was followed by quantum chemical calculations according to the semi-empirical AM1 method. The set of structural descriptors was supplemented using Dragon Ver 4.5 software. The list of descriptors is presented in Table 3.

Docking
AutoDock Ver 4.2 on the Ubuntu Linux platform was used for docking. MGLTools Ver 1.5.4 was used for preparation and conversion of structures in Linux. COX2 (PDB ID: 6COX) was used as the macromolecule and was set to rigid. The grid box was created with the default 40x40x40 points at a spacing of 0.375 Å and was centered on the active site of the protein, guided by the presence of Celecoxib in the original file. The number of genetic algorithm (GA) runs was set to twenty, and the best result of each set, the one with the lowest binding energy, was chosen. Structures were finally observed and examined using Swiss PDB Viewer Ver 4.0.4 and ViewerLite 4.2 in Windows 7.
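Two quantities implied by this setup can be checked with a few lines of arithmetic. The sketch below uses the standard thermodynamic relation ΔG = RT ln(Ki) to convert a binding energy into an inhibition constant, and multiplies the grid count by the spacing to get the box edge. The temperature of 298.15 K, the example energy of -9.0 kcal/mol, the reading of "40" as the number of grid intervals, and all function names are assumptions made for illustration; none of them comes from the study's data.

```python
import math

GAS_CONSTANT_KCAL = 1.98719e-3  # kcal/(mol*K)
TEMPERATURE_K = 298.15          # assumed temperature, not stated in the text

def inhibition_constant(delta_g_kcal_per_mol, temperature_k=TEMPERATURE_K):
    """Return Ki in mol/L from the relation dG = R * T * ln(Ki)."""
    return math.exp(delta_g_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))

def grid_box_edge(intervals, spacing_angstrom):
    """Edge length of a cubic grid box, in angstroms."""
    return intervals * spacing_angstrom

# Hypothetical binding energy, chosen only to show the order of magnitude.
example_energy = -9.0
ki = inhibition_constant(example_energy)

# 40 intervals at 0.375 angstrom each.
edge = grid_box_edge(40, 0.375)
```

Under these assumptions a binding energy of -9.0 kcal/mol corresponds to a Ki in the sub-micromolar range, and the grid box is 15 Å on each side.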

Artificial neural network (ANN) analysis
An ANN consists of nodes known as neurons. The neurons are arranged in a sequence of layers and connected to each other by variable connection weights (12). Each layer can have a different number of neurons with various transfer functions (19). Here the first layer is the input layer with 27 nodes, the last layer is the output layer with one node, and a hidden layer with 4 nodes lies between them. All three layers take part in the learning process of the network.
The data were divided randomly into three groups: a training set of 23 compounds, a validation set of 5 compounds and a test set of 5 compounds. The validation set is used to monitor the performance of the model during the training phase and to minimize overfitting. At the end of the training process, the ability of the ANN model to predict other data must be evaluated, and the test set is used for this purpose.
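The random 23/5/5 partition can be sketched as below. The function name, the use of integer compound indices, and the seeding scheme are assumptions for the example; the text does not say how the random selection was implemented.

```python
import random

def split_compounds(compound_ids, n_train=23, n_val=5, n_test=5, seed=None):
    """Randomly partition compounds into training, validation and test sets."""
    if n_train + n_val + n_test != len(compound_ids):
        raise ValueError("set sizes must add up to the number of compounds")
    rng = random.Random(seed)
    shuffled = list(compound_ids)
    rng.shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# The 33 compounds are represented here only by their indices 0..32.
compounds = list(range(33))

# One independent split per repetition (40 repetitions, seeded for reproducibility).
splits = [split_compounds(compounds, seed=run) for run in range(40)]
```

Each compound lands in exactly one of the three sets, so the test compounds are never seen during training or validation within a given repetition.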
The input vector presented to the ANN was normalized to the range 0 to 1.
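A common way to obtain inputs in the range 0 to 1 is min-max scaling applied to each descriptor separately. The sketch below assumes that scheme; the text does not state which normalization was used, and the small descriptor table and function name are invented for the example.

```python
def min_max_scale(rows):
    """Scale each column (descriptor) of a list of rows to the range [0, 1]."""
    n_features = len(rows[0])
    lows = [min(row[j] for row in rows) for j in range(n_features)]
    highs = [max(row[j] for row in rows) for j in range(n_features)]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column has no spread, so it is mapped to 0.0.
            (row[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(n_features)
        ])
    return scaled

# Hypothetical values: three compounds (rows) by three descriptors (columns).
descriptors = [
    [1.0, 200.0, 5.0],
    [3.0, 100.0, 5.0],
    [2.0, 150.0, 5.0],
]
scaled = min_max_scale(descriptors)
```

Scaling per descriptor keeps quantities with large numeric ranges from dominating the weighted sums in the hidden layer.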
We used multi-layer perceptron (MLP) network models with back propagation, in which the weighted sum of the inputs and a bias term is passed through the transfer function to produce the output. Transfer functions can take any form and may be linear or non-linear (20). In this study the transfer function in the first layer is the 'S'-shaped logistic sigmoid, whose general form is $f(x) = \frac{1}{1 + e^{-x}}$, and the transfer function in the second layer is linear. A network of this form can approximate functions well. The back-propagation algorithm in MATLAB's Neural Network Toolbox was used for ANN training. In this method, the output response is compared to a desired target response; if the actual response differs from the target response, the network generates an error signal, which is then used to calculate the adjustments that should be made to the weights so that the actual output matches the target output. The algorithm changes the weights until the error between the output and the target is minimized. Learning was completed in 150 epochs by the back-propagation method. To reduce the sensitivity of the ANN predictions to the assignment of compounds to the different sets, the experiment was repeated 40 times with different selections of the training, validation and test data sets. Figure 1 represents the architecture of the ANN model used for prediction of binding energy.
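The forward pass and weight update of a 27-4-1 network with a logistic sigmoid hidden layer and a linear output can be sketched in plain Python as follows. This is not the MATLAB toolbox routine used in the study: per-sample gradient descent on squared error, the learning rate of 0.01, the weight initialization, and the synthetic inputs and targets are all assumptions chosen to keep the example small and self-contained.

```python
import math
import random

def sigmoid(x):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-x))

def init_network(n_in=27, n_hidden=4, seed=0):
    """Small random weights for an n_in - n_hidden - 1 network."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def forward(net, x):
    """Return hidden activations (sigmoid) and the linear output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output

def train_epoch(net, inputs, targets, lr=0.01):
    """One pass of per-sample back propagation; returns the mean squared error."""
    total = 0.0
    for x, t in zip(inputs, targets):
        hidden, y = forward(net, x)
        err = y - t  # error signal at the linear output
        total += err * err
        # Hidden-layer deltas, computed with the output weights before they change.
        deltas = [err * w * h * (1.0 - h) for w, h in zip(net["w2"], hidden)]
        for j, h in enumerate(hidden):
            net["w2"][j] -= lr * err * h
        net["b2"] -= lr * err
        for j, d in enumerate(deltas):
            net["w1"][j] = [w - lr * d * xi for w, xi in zip(net["w1"][j], x)]
            net["b1"][j] -= lr * d
    return total / len(inputs)

# Synthetic stand-in data: 23 "compounds" with 27 inputs already scaled to [0, 1].
rng = random.Random(1)
inputs = [[rng.random() for _ in range(27)] for _ in range(23)]
targets = [-10.0 + 4.0 * x[0] for x in inputs]  # made-up energies, not study data

net = init_network()
history = [train_epoch(net, inputs, targets) for _ in range(150)]
```

With 27 inputs, 4 hidden nodes and 1 output, this network has 27·4 + 4 + 4 + 1 = 117 adjustable parameters, which is one reason a validation set is needed to watch for overfitting on 23 training compounds.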

Results
The values of the structural parameters of the drugs studied, derived from computational chemistry and reflecting their electronic properties, size (bulkiness), lipophilicity and other 2D and 3D parameters, are summarized in Table 4; the descriptors themselves are listed in Table 3. As Figure 1 shows, the network has 27 neurons in the input layer, 4 neurons in the hidden layer and 1 neuron in the output layer. Thus the final model possessed a 27-4-1 architecture.
An ANN model was used to correlate the binding energy of the set of structurally diverse drugs with their structural descriptors and to create a model useful for predicting binding energy.
Regression R values measure the correlation between outputs and targets. An R value of 1 indicates a perfect linear relationship, while 0 indicates no linear relationship. Table 1 presents the correlation coefficients between the docking-derived target outputs and the predicted outputs. These results are averages over the 40 repetitions for each set.
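The R value and its average over repeated runs can be computed as in the sketch below, which uses the Pearson correlation coefficient. The energy values and the per-run R values are invented for illustration and are not the study's data; the function names are likewise assumptions.

```python
import math

def pearson_r(targets, outputs):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / math.sqrt(var_t * var_o)

def mean(values):
    """Arithmetic mean, used here to average R over repeated runs."""
    return sum(values) / len(values)

# Hypothetical docking energies (targets) and ANN predictions for a 5-compound test set.
docking = [-9.1, -7.4, -8.3, -10.2, -6.8]
predicted = [-8.8, -7.9, -8.1, -9.9, -7.2]
r_test = pearson_r(docking, predicted)

# Averaging R over several repetitions (three made-up values shown here).
r_mean = mean([0.94, 0.96, 0.95])
```

Note that R measures linear association only; a high R does not by itself show that the predicted energies match the targets in absolute value, which is why error measures are reported separately.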
The correlation between docking and ANN binding energy values in the learning, validation and testing sets is shown in Figure 2.

Discussion
The results show that AutoDock and ANN data have a high correlation. As seen in Table 1, the accuracy of the results increases as nodes are added to the hidden layer. However, four hidden nodes already gave good results, so there was no need to add more neurons to the hidden layer. Thus the 27-4-1 model is a good structure. Table 2 gives information about the errors between target and output.

Conclusion
In the present study, a set of 27 descriptors was adopted to build a model describing the docking energy of 33 drugs of diverse chemical structure with antagonistic effects on the COX2 enzyme. We built a structure using neural networks that predicts binding energy and developed a multi-layer perceptron artificial neural network (ANN) model, trained by the back-propagation algorithm. The results show that docking results and ANN data have a high correlation. As presented in Table 1, the correlation coefficients for the learning, validation and testing sets were 0.973, 0.956 and 0.950, respectively. Also the error between